{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "application/vnd.databricks.v1+cell": {
     "inputWidgets": {},
     "nuid": "0b0b029f-2c03-4b70-b8f9-662395be7282",
     "showTitle": false,
     "title": ""
    }
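   },
   "source": [
    "# SparkContext\n",
    "\n",
    "SparkContext is the entry point of a Spark application.\n",
    "\n",
    "Every application must first construct a SparkContext object, which takes two steps:\n",
    "* Create a SparkConf object\n",
    "  * Set the application's basic information, such as the application name (AppName) and the master it runs on (Master)\n",
    "* Create the SparkContext object from that SparkConf object\n",
    "\n",
    "In interactive environments such as spark-shell, pyspark, and Databricks, a SparkContext has already been created for us by default, and we can access it directly through `sc`.\n",
    "\n",
    "For code that we develop and submit to a cluster, we need to create the SparkContext ourselves.\n",
    "\n",
    "For details, see: [https://spark.apache.org/docs/3.2.1/rdd-programming-guide.html](https://spark.apache.org/docs/3.2.1/rdd-programming-guide.html)"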
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "application/vnd.databricks.v1+cell": {
     "inputWidgets": {},
     "nuid": "9483eb27-c409-4805-9f21-58458a901ba0",
     "showTitle": false,
     "title": ""
    }
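   },
   "outputs": [],
   "source": [
    "from pyspark import SparkConf, SparkContext\n",
    "\n",
    "# Create a SparkConf object and set the application's configuration, such as the application name and the run mode (master)\n",
    "conf = SparkConf().setAppName(\"app\").setMaster(\"yarn\")\n",
    "# Build the SparkContext instance, which reads data and schedules Job execution\n",
    "sc = SparkContext(conf=conf)"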
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "application/vnd.databricks.v1+cell": {
     "inputWidgets": {},
     "nuid": "a2f4e76e-cc14-4900-892b-0d0034e43e3a",
     "showTitle": false,
     "title": ""
    }
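   },
   "outputs": [],
   "source": [
    "from pyspark import SparkConf, SparkContext\n",
    "\n",
    "if __name__ == '__main__':\n",
    "    # Typical master values: \"local[*]\" for local runs, \"yarn\" for a cluster (listed for reference only)\n",
    "    masters = [\"local[*]\", \"yarn\"]\n",
    "    # When the application runs on a cluster, the main function is the Driver, and it creates the SparkContext object\n",
    "    # Create a SparkConf object and set the application's configuration, such as the application name and the run mode (master)\n",
    "    conf = SparkConf().setAppName(\"app\").setMaster(\"yarn\")\n",
    "    # Build the SparkContext instance, which reads data and schedules Job execution\n",
    "    sc = SparkContext(conf=conf)"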
   ]
  }
 ],
 "metadata": {
  "application/vnd.databricks.v1+notebook": {
   "dashboards": [],
   "language": "python",
   "notebookMetadata": {},
   "notebookName": "SparkContext",
   "notebookOrigID": 2180139033493835,
   "widgets": {}
  },
  "kernelspec": {
   "display_name": "",
   "name": ""
  },
  "language_info": {
   "name": ""
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
